>
> This issue comes up every month or so; search back a bit through the
> newsgroups and you will find your question answered.
What was the conclusion on this issue in older threads?
> I doubt that just improving the dot product will speed things up to any
> noticeable degree.
Well, run POV in a profiler and take a look at where it's spending most
of its time.
>
> By default double uses 64 bits on x86. And there are good reasons to have
> this precision.
Yes, I'm sorry, I mixed it up with 'long double'. It has been a long time
since I programmed.
> This is taken from the AMD 3DNow SDK matrix (thus it is AMD's SIMD FPU
> extension, not Intel's), but for this purpose it will be enough:
>
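> ALIGN 32
> PUBLIC _a_dot_vect
> _a_dot_vect PROC
> movq mm0,[eax]      ; mm0 = first vector, x and y (two 32-bit floats)
> movq mm3,[edx]      ; mm3 = second vector, x and y
> movd mm1,[eax+8]    ; mm1 = first vector, z (upper half zeroed)
> movd mm2,[edx+8]    ; mm2 = second vector, z (upper half zeroed)
> pfmul mm0,mm3       ; mm0 = x1*x2 (low), y1*y2 (high)
> pfmul mm1,mm2       ; mm1 = z1*z2 (low), 0 (high)
> pfacc mm0,mm0       ; mm0 low = x1*x2 + y1*y2
> pfadd mm0,mm1       ; mm0 low = x1*x2 + y1*y2 + z1*z2
> ret
> _a_dot_vect ENDP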
Neat. Thanks. Unfortunately, I don't own an AMD processor. I'll try to
get one of those Athlons though.
Now, I'm not so skilled in assembler. How wide are the mm0, mm1, mm2 and
mm3 registers? Is this done on 32-bit 'float' variables?
As far as I have heard, Intel's implementation of the dot product is even
more 'automated', so you don't need to multiply registers 'by hand'. It's
all done in one instruction.
> As you can see, making this change is rather trivial. The problem is that
> you will need two versions of POV-Ray, one for AMD's extension and one
> for Intel's.
Ahh... that's the smallest problem.
> You do. Define DBL as float and watch POV-Ray "hang" in several functions
> because of the missing precision.
Note that this is not my idea of how this should be done. I would keep
all calculations as they are and just rewrite the dot-product function.
The 'double' values would be converted to 'float' just before the
calculation, and the result converted back to 'double' afterwards.
Well, we'll never know if we never try, right?
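
To make the idea concrete, here is a rough sketch in plain C of what I
have in mind. It is only an illustration: the VECTOR typedef and the
function names dot_double and dot_via_float are made up for this post and
are not taken from the POV-Ray source, and the float version uses ordinary
C arithmetic as a stand-in for the SIMD routine.

```c
typedef double DBL;       /* assumption: DBL stays double everywhere else */
typedef DBL VECTOR[3];    /* assumption: a vector is three DBL components */

/* Reference: the dot product computed entirely in double precision. */
DBL dot_double(const VECTOR a, const VECTOR b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Hypothetical variant: narrow the inputs to float, do the arithmetic in
 * single precision (where a SIMD routine could be dropped in), then widen
 * the result back to DBL for the caller. */
DBL dot_via_float(const VECTOR a, const VECTOR b)
{
    float ax = (float)a[0], ay = (float)a[1], az = (float)a[2];
    float bx = (float)b[0], by = (float)b[1], bz = (float)b[2];
    float r = ax * bx + ay * by + az * bz;
    return (DBL)r;
}
```

The catch is visible right away: anything smaller than roughly 1e-7
relative to the value is lost in the conversion, so a component like
1.0 + 1e-10 comes out as exactly 1.0 in the float version. Whether the
extra conversions eat up whatever the faster multiply gains is something
only a profiler run would show.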